Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

State when the runtime should and must not apply Linux ambient capabities #668

Closed

Conversation

justincormack
Copy link
Contributor

Ambient capabilities are a feature, since Linux 4.3, that enables capabilities
to be set on non root proesses directly. They are the only way to set these,
so are desirable in the case of "no new privileges" where suid binaries or
filesystem capabilities cannot be used as tis flag denies these operations,
and therefore there is no other way to apply capabilities to non root processes.

Without "no new privileges" users generally expect suid binaries or
filesystem capabilities to be the way to grant capabilities to non
root processes.

See opencontainers/runc#1286 for the runc pull
and detailed explanation of the security issues in offering a choice here.

Signed-off-by: Justin Cormack justin.cormack@docker.com

…lities

Ambient capabilities are a feature, since Linux 4.3, that enables capabilities
to be set on non root proesses directly. They are the only way to set these,
so are desirable in the case of "no new privileges" where suid binaries or
filesystem capabilities cannot be used as tis flag denies these operations,
and therefore there is no other way to apply capabilities to non root processes.

Without "no new privileges" users generally expect suid binaries or
filesystem capabilities to be the way to grant capabilities to non
root processes.

See opencontainers/runc#1286 for the `runc` pull
and detailed explanation of the security issues in offering a choice here.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
@lucab
Copy link

lucab commented Jan 23, 2017

Related to this topic, I think that in the pre-existing wording the capabilities field is a bit under-specified. For example, it doesn't spell out which set(s) of capabilities it affects.

More generally, this field can be read both as some upper limit that a process is allowed to reach (ie. focused on the bounding set) or as starting privileges that the process will already have (ie. focused on the ambient/effective set). This PR seems to go in the latter direction, so I think it's the right time to clarify the semantics.

@justincormack
Copy link
Contributor Author

@lucab yes that is a good point. It has always tried to do both, well at least runc has, and so what this spec was based from, ie it sets the bounding set, and then sets effective and inheritable to the same.

@justincormack
Copy link
Contributor Author

I am happy to add those changes here, and retitle this "clarify the capabilities spec"...

@mrunalp
Copy link
Contributor

mrunalp commented Jan 23, 2017

@justincormack WDYT of adding a separate boolean for ambient capabilities? That way the config generator has the option to do what it wants rather than tying this to noNewPrivileges?

@justincormack
Copy link
Contributor Author

@mrunalp yes that is an option, but I do think this version should cover all the use cases, without complicating the spec.

If you set NNP you must want ambient applied (well, at least if you choose a non root user, it makes no real difference in the root case) because otherwise you dont actually get the capabilities you asked for and have no way to get them under any circumstance, so it definitely makes sense here.

If you don't have NNP set, you are implying that you may want to raise capabilities, but if you set ambient caps you dont need to raise privs as you already have them.

So:
NNP=true ambient=false pointless, get no caps
NNP=true ambient=true sensible
NNP=false ambient=false sensible, historic behaviour
NNP=false ambient=true effectively same as NNP=true ambient=true

It is true that the NNP flag has some other effects (unpriv seccomp) but I don't think they affect this much.

For capability aware software, the nnp+ambient setting makes the most sense, and I would try to encourage people to use it - we have a PR for docker to have a daemon setting to make it the default now moby/moby#29984 and I am just working on a PR to go in conjunction with the runc PR that will reduce non root default capabilities in the nnp case. I also plan to improve the explanation for what NNP does.

For people running applications that use sudo and expect that to restrict caps, any ambient setting is a real problem, as it gives unprivileged processes root-like capabilities.

If we were to add a flag, I would set it to the same as NNP in Docker, as it is too confusing to have another option, and explain what ambient caps are. We cannot change the way sudo works for existing users (who of necessity are not using NNP).

@justincormack
Copy link
Contributor Author

@mrunalp another option is to simply expose the four capabilities sets in Linux, and make the user specify exactly what they want, but no one has asked for this, and it is very complex.

justincormack added a commit to justincormack/docker that referenced this pull request Jan 24, 2017
This is a rework of support for ambient capabilities, to avoid
the issues in the previous version, where there was a conflict
between two use cases, programs that want to use sudo and programs
that want to grant unprivileged users direct capabilities.

If you do not use the `--security-opt no-new-privileges` flag,
nothing changes with this patch. `sudo`, suid binaries and filesystem
capabilities elevate privileges, and non root users can only use
privileges via these mechanisms as on a normal Linux userspace.

With the `no-new-privileges` flag, the kernel does not allow caps
to be granted via suid binaries, so it is assumed that the user wants
to be granted capabilities directly, so ambient capabilities are
granted. For root this makes little difference, but for a normal
user this means that they can be granted capabilities directly, so
that privileged operations can be performed directly.

As previously no capabilities were granted to a non root user with
`no-new-privileges`, we take the opoprtunity to reduce the default
capability set in this case to only the three safest capabilities:
`CAP_KILL`, `CAP_AUDIT_WRITE` and `CAP_NET_SERVICE`. Other capabilities
must be granted with `--cap-add`.

`runc` commit is in opencontainers/runc#1286
Spec commit is in opencontainers/runtime-spec#668

fix moby#8460

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
@lucab
Copy link

lucab commented Jan 24, 2017

As you seem to be still in brainstorming mode, another idea is to follow the split you described in your other PR (modern vs traditional apps) and having two sets to describe:

  1. the maximum set of possible caps later usable by the process (via fcaps, setuid-bit, capset)
  2. the actual caps pre-activated by the parent for the process (which can then avoid caps manipulation)

This won't break backward compatibility (the first set is the one already in this spec, without runc build tag) and will allow for some more flexibility at the low level (eg. mixing NNP=0 and still having a defined upper bound and optional starting caps as well). The only invariant is that the first set must be a superset of the second set.

@justincormack
Copy link
Contributor Author

justincormack commented Jan 24, 2017

@lucab if NNP is defined it is not that much use the initial process having a broader bounding set than current set, as if it has CAP_SETPCAP it can raise, and if not it may as well have the initial set as the bounding set. With either mode it is really only useful if an initial process drops caps for sub processes anyway, so it has less privileged children, or it drops caps when it does not need them, but there is no real use case for the initial process not being fully privileged that I can see.

With the non NNP mode, well you would just use fscaps, and make the initial process non root.

There has never been a call for splitting up and making the caps more complicated, and I think it is because there is really no reason to.

@Lucan
Copy link

Lucan commented Jan 24, 2017

Its @lucab and not lucan (me) ;)

@cyphar
Copy link
Member

cyphar commented Jan 24, 2017

I don't really like tying up NNP with ambient capabilities. To be honest, I would prefer if we switch to something like capsh's format with the different bits (eipa). That would then be a minimally complicated modification -- the downside being that it is no longer POSIX capabilities capable (no pun intended).

NNP=true ambient=false pointless, get no caps

But it isn't pointless. If you're running a root process that then switches UID/GID and then executes another process, then this change will mean that the new process now has the full capability set.

I do agree that the usual case is "pointless" with an unprivileged user, but actually you could implement this in several ways that makes it not pointless (not that I'm suggesting you should implement it by overwriting the process's code with ptrace(2)).

I understand why you'd want to do this, but NNP has a very specific meaning in Linux -- it's a bit unfortunate that it isn't in the Linux-specific section so it makes it more prone to having its semantics be changed like this.

@justincormack
Copy link
Contributor Author

@cyphar yes I guess that the NNP, root, ambient case is potentially a bit confusing for a program that expects to be able to drop caps by just changing uid.

The capsh format is pretty horrible, I have not yet met someone who can understand it, and it does not really make sense to have a microformat like that in a JSON spec. We could just have the four arrays though.

POSIX capabilities don't exist in any meaningful sense - POSIX dropped the spec and the only similar implementation is the Linux one.

@mrunalp mrunalp added this to the 1.0.0 milestone Feb 1, 2017
@mrunalp
Copy link
Contributor

mrunalp commented Feb 1, 2017

We discussed this on the OCI call today and we think that having a separate boolean for ambient caps would be preferable.

@dqminh
Copy link
Contributor

dqminh commented Feb 1, 2017

@mrunalp hmm, but the ambient capability set can also only be a subset of the process' bounding set right ? In that case specifying only a boolean feels restrictive. Maybe adopting current systemd configuration i.e. 1 list for bounding set, 1 list for ambient also works here ?

@justincormack
Copy link
Contributor Author

justincormack commented Feb 1, 2017 via email

@crosbymichael
Copy link
Member

So we should have this then?

{
    "capabilities" : [],
    "ambient_capabilities": []
}

@justincormack
Copy link
Contributor Author

@crosbymichael it is more logical to have

{
    "capabilities": {
          "bounding": [],
          "ambient": []
     }
}

but obviously that is less backward compatible. It is more forward compatible if we want to add the other two sets though.

@crosbymichael
Copy link
Member

i'm fine changing it to something move obvious while we are still pre 1.0 if everyone else is good with it. its not like we haven't changed the types of things 1000 times

@mrunalp
Copy link
Contributor

mrunalp commented Feb 1, 2017

@justincormack @crosbymichael SGTM

@crosbymichael crosbymichael self-assigned this Feb 2, 2017
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 2, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 2, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 2, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 2, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 2, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 3, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
@justincormack
Copy link
Contributor Author

Closing in favour of #675

crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 8, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 8, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 8, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 8, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 9, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 9, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 13, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 17, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 17, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 22, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
crosbymichael added a commit to crosbymichael/specs that referenced this pull request Feb 22, 2017
Closes opencontainers#668

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants